The concept of the reduced set of contact maps is introduced. Using this concept we find the 
ground state candidates for the Hydrophobic-Polar lattice model on a two-dimensional square lattice. 
Using these results we exactly enumerate the native states of all proteins for a wide range of energy 
parameters. In this way, we show that some sequences have an absolute native 
state. Moreover, we study the scale dependence of the number of members of the reduced set, 
the number of ground state candidates, and the number of perfectly stable sequences by comparing 
the results for sequences with lengths from 6 to 20. 
PACS numbers: 87.14.Ee, 87.15.Cc, 87.15.Aa, 87.15.-v 



I. INTRODUCTION 

Proteins are biological macromolecules made of thousands of atoms. These atoms 
interact with each other and with the surrounding water molecules. In principle, 
determining the states of a protein requires a full quantum mechanical calculation, 
but the complexity of these macromolecules renders this impossible. A feasible 
approach to this problem is based on a coarse-grained view, in which proteins are 
made from 20 types of monomers (amino acids). The most important point in this 
approach is the determination of the effective interactions between the amino acids. 
It seems that information about the effective inter-monomer interaction energy and 
the coding of the amino acids in the sequence is sufficient to determine the protein 
characteristics. 

The structural information of a protein can be coded in a contact map. A contact 
map is a binary $L \times L$ matrix $C$. The element $C_{ij}$ of this matrix is 
nonzero if the $i$th and $j$th monomers are in contact. The contact may be defined 
in several ways. Obviously, the information coded in a contact map is not sufficient 
for a complete characterization of the spatial configuration. However, the 
short-range nature of inter-monomer interactions suggests that one can determine the 
configuration energy in terms of contacts. Thus, if one knows the effective 
inter-monomer interactions in this coarse-grained approximation, the contact map has 
sufficient information to calculate the configuration energy. Many papers study the 
thermodynamic and structural properties of proteins by using contact maps. 

It is well known that the biological functionality of proteins depends on the shape 
of their native states. The native structure is the unique minimum free energy 
structure for the protein sequence. As any protein in nature must have a 
well-defined function, the uniqueness of native states is a biological necessity for 
these molecules of life. Thus, searching the configuration space to find native 
states, by Monte Carlo methods or exact enumeration, has been the subject of many 
papers. In most previous works, the problem was studied for given values of the 
inter-monomer energy parameters. As our knowledge about the effective interactions 
is not certain, and the native structures of proteins may be sensitive to these 
parameters, looking at the native states for different energy parameters is 
relevant. By using a simple Hydrophobic-Polar (HP) lattice model, we have shown in a 
recent work that the number of ground state candidates for any sequence is 
unexpectedly small. This suggests that the problem can be studied for a wide range 
of interaction parameters by exact enumeration. We study this problem on a 
two-dimensional square lattice. In this approach a protein structure is modeled by a 
self-avoiding walk on the lattice, and any pair of monomers which are nearest 
neighbors on the lattice but are not adjacent along the sequence (non-sequential 
neighbors) are in contact. 

The number of possible configurations for an $L$-mer is equal to the number of 
self-avoiding walks ($N_{SAW}$) with $L-1$ steps. We have: 

$$N_{SAW} \sim L^{\gamma-1}\, z_{\mathrm{eff}}^{L}, \qquad (1)$$

in which $\gamma$ is a dimension-dependent constant, and $z_{\mathrm{eff}}$ is the 
effective coordination number. For a two-dimensional square lattice, 
$\gamma = 43/32$ and $z_{\mathrm{eff}} = 2.64$. Since many of these walks give the 
same contact matrix, the number of possible contact matrices (physical maps), $N_c$, 
is much smaller, although it is still very large. In a recent work the number of 
physical maps was fit to a formula similar to equation (1), and a value of 
$z_c = 2.29$ was obtained. 
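
The growth law of equation (1) can be explored numerically with a brute-force count. The following Python sketch is only an illustration and is not the enumeration code used for this paper: the function name `count_saws` is hypothetical, walks are counted from a fixed origin, and walks related by rotation or reflection are counted separately (an assumption; a symmetry-reduced count would be smaller by a roughly constant factor).

```python
def count_saws(steps):
    """Count self-avoiding walks with `steps` steps on the square lattice.

    Assumption of this sketch: walks start at the origin and walks related
    by rotation or reflection are counted as distinct.
    """
    moves = ((1, 0), (-1, 0), (0, 1), (0, -1))

    def extend(x, y, visited, remaining):
        # Depth-first extension of the walk by one lattice step at a time.
        if remaining == 0:
            return 1
        total = 0
        for dx, dy in moves:
            nxt = (x + dx, y + dy)
            if nxt not in visited:
                visited.add(nxt)
                total += extend(nxt[0], nxt[1], visited, remaining - 1)
                visited.remove(nxt)
        return total

    return extend(0, 0, {(0, 0)}, steps)


if __name__ == "__main__":
    # An L-mer corresponds to a walk with L - 1 steps.
    counts = [count_saws(n) for n in range(1, 11)]
    for n in range(1, len(counts)):
        # Ratio of successive counts, a rough finite-size estimate of z_eff.
        print(n + 1, counts[n], round(counts[n] / counts[n - 1], 3))
```

For short walks the ratio of successive counts is still noticeably larger than 2.64, which is the expected finite-size behavior given the power-law prefactor in equation (1).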

If one is interested only in the native structure of proteins, the set of contact 
maps can be reduced further by removing all maps which have no chance to be a native 
state. We call the remaining maps the reduced set of contact maps. This reduction is 
due to the physical fact that all effective interactions between amino acids are 
negative. The reduced set of contact maps can be used in enumeration studies to find 
the possible ground states and the native states of proteins. In this paper we use a 
simple HP lattice model to address the problem for proteins of various lengths in 
more detail. We obtain some ground state candidates that possess some known 
properties common to real proteins. A stability against the variation of interaction 
parameters is also shown. Some evidence for this stability has been reported in 
other works. 

II. THE REDUCED CONTACT MAPS 

The effective potential energies between the 20 types of amino acids can be 
described by a $20 \times 20$ interaction matrix $M$. The energy of a given sequence 
$\sigma$ in any structure can be determined from 

$$E = \sum_{i<j} C_{ij}\, m_{\sigma_i \sigma_j}. \qquad (2)$$

Here $C_{ij}$ and $m_{\sigma_i \sigma_j}$ are respectively the elements of the 
contact matrix ($C$) and the interaction matrix ($M$). This shows that all 
configurations which have the same contact map have equal energies. If we look at 
the energy spectrum of one sequence, the states corresponding to such maps are 
degenerate. We call such degeneracies type-one degeneracies, to distinguish them 
from other kinds of degeneracies which we shall encounter later. 
If the energy of a sequence is minimum in such states, this 
sequence does not have a unique native state. Such se- 
quences are not protein-like. The states corresponding 
to such degenerate contact maps can never be a native 
state, however, we cannot exclude them from our search, 
because they compete with other maps. On the other 
hand, there are some maps, which cannot be the ground 
state, and do not have a role in the competition for the 
ground state. To see that, consider two contact matrices 
$C_1$ and $C_2$ and their difference ($C' = C_1 - C_2$). We call $C_2$ a component 
of $C_1$ if all elements of $C'$ are non-negative ($C'_{ij} = 0$ or $1$). Note that 
$C'$ has at least one non-zero element. Using equation (2), the energy of an 
arbitrary sequence $\sigma$ in the configuration(s) corresponding to the map $C_1$ 
can be written as: 

$$E_1 = \sum_{i<j} \left[ (C_2)_{ij} + C'_{ij} \right] m_{\sigma_i \sigma_j} = E_2 + \sum_{i<j} C'_{ij}\, m_{\sigma_i \sigma_j}. \qquad (3)$$

According to experimental data all elements of the interaction matrix $M$ are 
negative. Thus the second term on the right-hand side gives a negative contribution 
to the energy, and $E_1 < E_2$ for any sequence. Then the map $C_2$ can never be a 
ground state. One can find all component maps such as $C_2$ and remove them from the 
set of contact maps. Such component maps are related to configurations which can 
fold to more compact shapes without losing any of their old contacts. By this 
procedure, a reduced collection of maps is found. We call this collection the 
reduced set of contact maps, and we denote the number of its elements by $N_r$. We 
have enumerated $N_r$ for sequences with lengths up to 20. The results are shown in 
figure 1. In this figure the number of reduced maps ($N_r$) is compared with the 
number of self-avoiding walks ($N_{SAW}$) and the number of physical maps ($N_c$) on 
a two-dimensional square lattice. Although all of these quantities behave similarly, 
the growth rate of $N_r$ is much slower than the others. If we fit the data to 
equation (1) we obtain $\gamma_r = 1.37$ and $z_r = 2.01$. These results are not 
enough to see whether the value of $\gamma_r$ is lattice-dependent or not. In the 
case of self-avoiding walks it is lattice-independent. In figure 1 there are other 
points which show the number of native states. We will discuss this matter in 
section 4. 
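
The removal of component maps described in this section amounts to keeping only the maximal contact sets. A minimal Python sketch is given below, assuming that each contact map is stored as a `frozenset` of contacts $(i, j)$ with $i < j$; this representation, the function names, and the toy list `physical_maps` are illustrative assumptions, not data or code from this work.

```python
def is_component(c2, c1):
    """Return True if map c2 is a component of map c1.

    With maps stored as sets of contacts, C' = C1 - C2 has only 0/1 entries
    and at least one non-zero entry exactly when c2 is a proper subset of c1.
    """
    return c2 < c1


def reduced_set(maps):
    """Keep only the maps that are not a component of any other map."""
    maps = set(maps)
    return {c for c in maps if not any(is_component(c, other) for other in maps)}


# Toy input (hypothetical maps, not taken from an actual enumeration).
physical_maps = [
    frozenset(),
    frozenset({(0, 3)}),
    frozenset({(0, 3), (0, 5)}),
    frozenset({(1, 4)}),
    frozenset({(0, 5), (2, 5)}),
]
reduced = reduced_set(physical_maps)
```

In this toy input the empty map and the single-contact map {(0, 3)} are components of larger maps and are removed, while the remaining three maps survive. The pairwise comparison is quadratic in the number of maps, which is acceptable for a sketch but would need a smarter ordering (for example by number of contacts) for long chains.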

Let us consider the number of contacts ($b = \sum_{i<j} C_{ij}$) as a measure of the 
compactness of configurations. A better parameter is the relative compactness 
$\Gamma$, 

$$\Gamma = \frac{b}{b_{Max}}, \qquad (4)$$

where $b_{Max}$ is the maximum number of possible contacts for sequences of the same 
length. The maximum numbers of contacts $b_{Max}$ for sequences of length 6, 8, 10, 
12, 14, 16, 18 and 20 are 2, 3, 4, 6, 7, 9, 10 and 12, respectively. In figure 2, 
the number of members of the reduced set of contact maps vs. the number of contacts 
is compared with the corresponding numbers of SAWs and physical maps for proteins of 
length 18. We see that the reduced set of contact maps contains only highly compact 
configurations. This shows why the results of studies on compact structure spaces 
are reasonable. In figure 3, the average compactness for SAWs, physical maps and 
reduced maps is compared for sequences of various lengths. As one can see, 
$\langle\Gamma_{SAW}\rangle < \langle\Gamma_c\rangle < \langle\Gamma_r\rangle$. 
There is an oscillatory behavior in the graphs. Note that $b_{Max}$ is an integer. 
The highest ratio of $b_{Max}$ to length ($L$) occurs for sequences which can be 
fitted to a square structure. Thus, sequences with such lengths have lower average 
compactness. This is due to the finite size effect and also the fact that the number 
of contacts has to be an integer; the same behavior can be observed in our other 
results in this paper. 
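
To make the definitions of $b$, $b_{Max}$ and $\Gamma$ concrete, the following Python sketch extracts the contacts of a walk and finds the maximum number of contacts by brute force for short chains. The function names and the representation of a walk as a list of lattice sites are assumptions of this sketch, and the exhaustive search is only practical for small $L$.

```python
def contacts(walk):
    """Return the contacts (i, j), i < j, of a walk given as a list of sites."""
    index = {site: i for i, site in enumerate(walk)}
    found = set()
    for i, (x, y) in enumerate(walk):
        # Looking only right and up visits every lattice bond exactly once.
        for neighbor in ((x + 1, y), (x, y + 1)):
            j = index.get(neighbor)
            if j is not None and abs(i - j) > 1:  # skip sequential neighbors
                found.add((min(i, j), max(i, j)))
    return found


def walks(length):
    """Yield all self-avoiding walks of `length` monomers from the origin."""
    def extend(walk, visited):
        if len(walk) == length:
            yield list(walk)
            return
        x, y = walk[-1]
        for nxt in ((x + 1, y), (x - 1, y), (x, y + 1), (x, y - 1)):
            if nxt not in visited:
                walk.append(nxt)
                visited.add(nxt)
                yield from extend(walk, visited)
                visited.remove(nxt)
                walk.pop()
    yield from extend([(0, 0)], {(0, 0)})


def max_contacts(length):
    """Brute-force b_Max for an L-mer (small L only)."""
    return max(len(contacts(w)) for w in walks(length))


def relative_compactness(walk, b_max):
    """Relative compactness Gamma = b / b_Max of equation (4)."""
    return len(contacts(walk)) / b_max


# A 6-mer folded into a 2 x 3 rectangle.
snake = [(0, 0), (1, 0), (2, 0), (2, 1), (1, 1), (0, 1)]
```

For the 2 x 3 rectangle above the contacts are (0, 5) and (1, 4), so $b = 2$ and $\Gamma = 1$; `max_contacts(6)` and `max_contacts(8)` should reproduce the first two $b_{Max}$ values quoted in the text.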

If one scales the number of reduced maps ($N_r$) by the 
number of total structures ($N_{SAW}$) at each compactness, 
a scale-independent behavior can be seen (figure 4). It 
also seems that there is a critical compactness, below 
which the compactness of the members of reduced set 
never drops. We do not have an exact analytical proof, 
but it seems from these data that a transition occurs in 
the number of reduced maps, near the compactness of 
0.8 and it vanishes for a compactness below 0.5. 

Contact maps can correspond to more than one structure. We call such maps degenerate 
maps. These maps cannot correspond to the native state of any sequence. Within the 
set of reduced maps there are fewer such degenerate maps than within the set of 
physical maps. Figure 5 compares the percentage of non-degenerate maps for reduced 
and physical maps. It seems that both of them approach asymptotic values. 

III. THE GROUND STATE CANDIDATES FOR THE HP MODEL 

The native states of proteins are to be found among the structures corresponding to 
the reduced set of contact maps. The sequence of the amino acids along the protein 
chain and their interactions have an essential role in the selection of a particular 
structure as the native state. In the coarse-grained viewpoint, the interaction 
between the amino acids is characterized by the effective energies. These effective 
interactions depend on the properties of the solution. A relevant question is how 
sensitive the native structures are to changes in these interactions. We address 
this question by enumerating the possible ground states of protein sequences for a 
wide range of effective inter-monomer interaction energies. 

Without any loss of generality, we use a hydrophobic-polar (HP) two-dimensional 
lattice model in this paper. The general form of the interactions between H and P 
monomers in an HP model can be written as follows: 

$$E_{HH} = -2 - \gamma - E_c,$$
$$E_{HP} = -1 - E_c,$$
$$E_{PP} = -E_c, \qquad (5)$$

where $E_{\sigma\sigma'}$ is the contact energy between monomers of types $\sigma$ 
and $\sigma'$. These potential energies act only between non-sequential nearest 
neighbors. Here $\gamma$ and $E_c$ are the mixing and compactness potentials 
respectively, two parameters which are determined from experimental data. There are 
many publications based on this model, and in most of them the values of $\gamma$ 
and $E_c$ are fixed. Here, we consider them as two free parameters and discuss our 
results in terms of them. 

It has been argued that the following relations should hold between inter-monomer 
energies: 

$$E_{HH} < E_{HP} < E_{PP},$$
$$E_{HH} + E_{PP} < 2E_{HP}. \qquad (6)$$

These arguments are based on the compactness of the native states and some 
calculations on the $20 \times 20$ inter-monomer interaction matrix $M$. These 
restrict $\gamma$ and $E_c$ to positive values ($\gamma, E_c > 0$). 

At first sight, it might seem possible to arrive at any native state for a given 
sequence by changing $\gamma$ and $E_c$. But when we consider the geometrical 
properties of the ground state, we find that these parameters are not powerful 
enough to select any configuration as the native state. In other words, the native 
states are stable against the change of interaction parameters. 

If we consider H = -1 for hydrophobic monomers and P = 0 for polar monomers, a given 
sequence can then be represented by a binary vector ($\sigma$). The energy of this 
sequence in a configuration characterized by a contact matrix $C$ can be written as: 

$$E = -m - a\gamma - bE_c, \qquad (7)$$

where $m$, $a$ and $b$ are three integers, related to $\sigma$ and $C$ 
as follows: 



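$$m = -\sigma^{T} \cdot C \cdot \mathbf{1}, \qquad a = \frac{1}{2}\, \sigma^{T} \cdot C \cdot \sigma, \qquad b = \frac{1}{2}\, \mathbf{1}^{T} \cdot C \cdot \mathbf{1}. \qquad (8)$$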



It can be seen that $m$ is equal to the number of all non-sequential neighbors of H 
monomers in the configuration, $a$ is the number of H-H contacts and $b$ is the 
number of all contacts. It can be shown that the following inequalities hold between 
these parameters: 

$$m - b \le a \le \frac{m}{2} \le b. \qquad (9)$$
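
As an illustration of how the three integers and the energy follow from a sequence and its contacts, here is a small Python sketch. It assumes contacts are stored as pairs $(i, j)$ with $i < j$, so each contact is counted once and the factors of 1/2 in the matrix expressions disappear; all function names are hypothetical.

```python
def encode(sequence):
    """Map an HP string to the vector sigma: H -> -1, P -> 0."""
    return [-1 if residue == "H" else 0 for residue in sequence]


def mab(sigma, contact_set):
    """Return the integers (m, a, b) for contacts given as pairs (i, j), i < j."""
    # m: every contact adds one for each H monomer taking part in it.
    m = sum(-(sigma[i] + sigma[j]) for i, j in contact_set)
    # a: sigma_i * sigma_j is 1 only for an H-H contact.
    a = sum(sigma[i] * sigma[j] for i, j in contact_set)
    # b: total number of contacts.
    b = len(contact_set)
    return m, a, b


def energy(m, a, b, gamma, e_c):
    """Energy E = -m - a*gamma - b*E_c of a level (m, a, b)."""
    return -m - a * gamma - b * e_c


def satisfies_inequalities(m, a, b):
    """Check the chain m - b <= a <= m/2 <= b."""
    return m - b <= a <= m / 2 <= b


sigma = encode("HPPHPH")
level = mab(sigma, {(0, 5), (1, 4)})
```

For the sequence HPPHPH with one H-H contact (0, 5) and one P-P contact (1, 4), this gives $(m, a, b) = (2, 1, 2)$. With $\gamma = E_c = 1$ the energy is $-5$, which agrees with adding the pair energies $E_{HH} = -4$ and $E_{PP} = -1$.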

Equation (7) suggests that the energy levels of a given sequence can be described by 
three integer numbers $(m, a, b)$. It is highly probable that these states are 
degenerate. There are three types of degeneracy: 

• Type 1: $C = C'$ 

In which case two or more configurations with different shapes have the same contact 
matrix. These configurations will remain degenerate for any sequence and any choice 
of $\gamma$ and $E_c$. These are the configurations corresponding to the degenerate 
maps already mentioned in section 2. This type of degeneracy is more probable for 
configurations with low compactness (see figure 2). Note that we are not talking 
about configurations which are related to each other by spatial symmetries, i.e. 
rotation, reflection, etc.; for our purpose such configurations are identical. 

• Type 2: $(m, a, b) = (m', a', b')$ but $C \neq C'$ 

In this case one particular sequence has the same $m$, $a$ and $b$ values in two or 
more configurations. This degeneracy persists for any value of $\gamma$ and $E_c$, 
but may disappear for another sequence. Although this degeneracy depends on sequence 
coding, the $b = b'$ condition is purely geometrical, and is a necessary condition 
for this degeneracy. 

• Type 3: $E = E'$, but $(m, a, b) \neq (m', a', b')$ 

One sequence has the same energy in two different states $(m, a, b)$ and 
$(m', a', b')$, provided that $\gamma$ and $E_c$ obey the following relation: 

$$(m - m') + (a - a')\gamma + (b - b')E_c = 0. \qquad (10)$$

This degeneracy is related to both the sequence coding $\sigma$ and the 
inter-monomer interactions. 

The first type of these degeneracies is completely geometric. The second one depends 
on both geometry and 
the amino acids' coding sequence. These two types do not 
depend on the values of the interaction energies. Thus, 
in the energy spectrum of any sequence there are some 
states, which are degenerate independently from the po- 
tential. If the ground state of a particular sequence is one 
of these degenerate states, that sequence does not have 
a unique native structure. 
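
The three cases can be told apart mechanically once the contact map and the integers $(m, a, b)$ of each configuration are known. The Python sketch below is one possible way to do this, under the assumption that each state is passed as a pair `(contact_map, (m, a, b))` and that the two configurations are not related by a lattice symmetry; the names `classify` and `crossing_e_c` are hypothetical.

```python
def classify(state, other):
    """Classify the relation between two distinct configurations of one sequence.

    Each state is a pair (contact_map, (m, a, b)). The configurations are
    assumed to be different shapes (not related by rotation or reflection).
    """
    (c1, t1), (c2, t2) = state, other
    if c1 == c2:
        return "type 1"            # same contact map, hence same energy always
    if t1 == t2:
        return "type 2"            # same (m, a, b) for this sequence
    return "type 3 candidate"      # equal energies only on a crossing line


def crossing_e_c(t1, t2, gamma):
    """Solve the level crossing condition for E_c at a given gamma.

    Returns None when b == b', in which case the condition does not fix E_c.
    """
    (m1, a1, b1), (m2, a2, b2) = t1, t2
    if b1 == b2:
        return None
    return -((m1 - m2) + (a1 - a2) * gamma) / (b1 - b2)
```

For example, the levels (4, 2, 2) and (3, 1, 3) cross on the line $E_c = 1 + \gamma$, so `crossing_e_c((4, 2, 2), (3, 1, 3), 1.0)` evaluates to 2.0. A "type 3 candidate" is only relevant if the crossing line passes through the region $\gamma, E_c > 0$.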

The third type is not actually a degeneracy at all. Equation (10) corresponds to a 
line in the parameter space of $E_c$ and $\gamma$. This line is a level crossing 
line. Degeneracy actually occurs only on the line, and a highly accurate fine-tuning 
is needed to reach a point on this line. For two sets of interaction energy 
parameters on the two sides of this line, the energy ordering of the states is 
different. For any pair of states such an ordering line exists. By drawing all 
ordering lines in the space of $E_c$ and $\gamma$, this space is divided into many 
ordering zones. We are only interested in the ground state, which means that many of 
these ordering lines are not relevant: some of them only govern the ordering of the 
excited states. By removing the irrelevant lines, one gets a diagram which shows the 
ground state cells (Fig. 6). As mentioned before, changing the inter-monomer 
interaction parameters inside any of these cells does not change the ground state. 
In some recent works this picture is introduced to show the stability of native 
states against changes in the interaction parameters. They only looked at one of 
these cells in the neighborhood of some selected interaction values. But by looking 
at the whole energy space, one can find all possible ground states and their 
corresponding cells. Any such cell in the space of energy parameters is associated 
with one ground state candidate. The number of cells is equal to the number of 
ground state candidates ($G_c(\sigma)$). By drawing such diagrams, one can easily 
find the ground state for any choice of $E_c$ and $\gamma$. Fig. 6 shows this 
diagram for a 20-mer. In this example there are only seven possible ground states. 
The cells marked with the numbers "1" and "2" correspond to type-1 and type-2 
degenerate states respectively; therefore there is no unique native structure for 
these cells. The sequence in this example has 3 non-degenerate states. These 
structures are shown in the figure. It is possible that all the ground state 
candidates of a given sequence are degenerate. Such sequences constitute universally 
bad sequences, i.e. for any set of interaction parameter values they do not have a 
native structure. Any sequence which is not a bad sequence we call a good sequence. 
Nearly 54% of the sequences of length 20 are good sequences, i.e. for some specific 
set of energy parameters they have a native state. 
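
The cell picture can be approximated numerically by scanning the parameter space and recording which level is lowest at each point. The Python sketch below does this on a regular grid; it is a brute-force stand-in and not the exact ordering-line construction described in the text, and a grid can miss cells that are thinner than the grid spacing. The function name, the default grid bounds, and the toy list of levels are assumptions made for illustration.

```python
def ground_state_candidates(levels, lo=0.1, hi=12.0, step=0.1):
    """Return the (m, a, b) levels that are lowest somewhere on a grid.

    `levels` is a list of (m, a, b) triples for one sequence. The grid covers
    lo <= gamma, E_c <= hi with spacing `step` (assumed values, positive only).
    """
    n = int(round((hi - lo) / step)) + 1
    values = [lo + k * step for k in range(n)]
    winners = set()
    for gamma in values:
        for e_c in values:
            # Energy of a level t = (m, a, b) is -m - a*gamma - b*E_c.
            best = min(levels, key=lambda t: -t[0] - t[1] * gamma - t[2] * e_c)
            winners.add(best)
    return winners


# Toy spectrum (hypothetical levels, not taken from an enumeration).
toy_levels = [(4, 2, 2), (3, 1, 3), (2, 1, 2)]
candidates = ground_state_candidates(toy_levels)
```

In this toy spectrum the level (2, 1, 2) always lies above (4, 2, 2), so only two candidates remain, separated by the line $E_c = 1 + \gamma$. The scan illustrates why the number of candidates stays small: a level can only win if no other level has larger or equal values of all three integers.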

The interesting point in figure 6 is that the number of ground state candidates is 
very small. The largest values of $G_c$ for sequences with length 6, 8, 10, 12, 14, 
16, 18, 20 are 1, 1, 1, 3, 4, 5, 6, 7 respectively. Fig. 7 shows the histogram of 
$G_c(\sigma)$ for all sequences with $L = 20$. The light gray area in this figure 
shows the result for all $2^{20}$ sequences, and the dark area shows the results for 
good ones. From this diagram it can be seen that the mean value of $G_c(\sigma)$ is 
very small. The average of $G_c(\sigma)$ for various lengths is shown in figure 8. 
The data in hand are not enough to draw a reliable conclusion about the number of 
ground state candidates for sequences of large length, but the average number does 
not seem to grow very rapidly, and the growth appears to be linear. Extrapolating 
this growth to sequences of length 200, 30 ground state candidates are predicted on 
average. Comparison of the average value of $G_c(\sigma)$ for these sequences with 
the number of all configurations (i.e. for sequences with length 20 the number of 
sequences is on the order of 10^) shows that the geometric constraints play an 
important role in selecting a state as the ground state. The reason that there are 
few ground state candidates for any sequence can be given by a geometrical argument. 
This argument shows that the upper estimate for the maximum $G_c$ is L^. 

As figure 8 shows, there are some good sequences with 
$G_c = 1$. This means that for any set of energy parameter 
values, they have the same unique ground state. Fig. 9 
shows some of these sequences and their unique native 
structures. Indeed the native states of these sequences 
have perfect stability with respect to a change of the en- 
ergy parameters. Our enumeration shows that these ab- 
solute native structures are to be found among the most 
compact structures. As figure 10 shows, although the 
ratio of the number of perfectly stable sequences to the 
number of all possible proteins decreases with increas- 
ing L, their actual number increases. This suggests that 
for the proteins with typical lengths near that of natu- 
ral proteins, perfectly stable sequences constitute a small 
but non-zero fraction of all possible sequences. A relevant 
question is whether the existence of these perfectly sta- 
ble sequences is due to the simplifications in our model. 
We cannot give a definite answer to this question, but such sequences may exist in 
models with more monomer types. 

The existence of these sequences may answer some questions about protein folding. 
Their number is small compared with the huge number of possible amino acid 
sequences, and their native states are highly compact and stable against changes in 
the inter-monomer interactions (i.e. the properties of the solution). 



IV. NATIVE STRUCTURES 

In section 2 we introduced the reduced set of contact maps. As was shown, the number 
of maps belonging to this set, $N_r$, is much smaller than the number of structures 
$N_{SAW}$. But the number of those structures which can be the native state is 
smaller still. The number of possible native structures, $N_{native}$, is shown in 
figure 1. In this figure all those structures which have been the native state of 
some sequence for at least one set of energy parameter values have been counted. 
Fitting the data to an equation similar to equation (1) gives 
$\gamma_{native} = 1.87$ and $z_{native} = 1.68$. In figure 2, we have compared the 
number of native structures as a function of their compactness with the total number 
of physical maps and with the number of maps in the reduced set for $L = 18$. It can 
be seen that there are no native structures with fewer than 8 contacts. The average 
compactness of native states is also compared in figure 3. 

We can introduce a designability parameter $D$ for these native states. However, our 
definition is a bit different from the commonly used definition. According to the 
common definition, designability shows how many times a structure is selected as the 
native state for a fixed set of interaction parameters. In our case we count how 
many times a structure becomes the candidate for a non-degenerate ground state. 
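
The counting behind this definition can be sketched in a few lines of Python. The input format is an assumption of this sketch: a dictionary that maps each sequence to the structures of its non-degenerate ground state candidates, with structures identified by arbitrary hashable labels. The labels and sequences in the example are hypothetical.

```python
from collections import Counter


def designability(native_candidates):
    """Count, for each structure, the sequences for which it is a
    non-degenerate ground state candidate.

    `native_candidates` maps a sequence to an iterable of structure labels.
    """
    counts = Counter()
    for structures in native_candidates.values():
        # A structure is counted at most once per sequence.
        counts.update(set(structures))
    return counts


# Toy input with hypothetical structure labels.
example = {
    "HHPH": ["s1", "s2"],
    "HPHH": ["s1"],
    "PPPP": [],
}
D = designability(example)
```

Here structure "s1" has designability 2 and "s2" has designability 1, while the sequence with no non-degenerate candidate contributes nothing.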

Figure 11 shows the histogram of designability for structures with length 20. As one 
can see, the results are very similar to those for a fixed set of energy parameters 
in the space of compact structures. The average designability as a function of 
compactness for $L = 20$ is shown in figure 12. As the diagram shows, the peak 
average designability occurs for the most compact structures, and it falls sharply 
with decreasing compactness. Thus if one is only interested in highly designable 
structures, it is reasonable to search the space of compact structures. 



V. THE SPACE OF ENERGY PARAMETERS, $E_c$ AND $\gamma$ 

One of the important aspects of the work done in this paper is that we can find 
exact results for any range of energy parameters. The time it takes for our program 
to find the ground state candidates for all sequences by exact enumeration is of the 
same order as that of the usual enumeration schemes for only one particular set of 
energy parameters. Because the average number of ground state candidates is very 
small, the determination of the native ground states for any range of interest takes 
only a little time. We found the native states of all sequences of length 20, for 
all pairs of energy parameters within a $12 \times 12$ square in arbitrary units, 
with a grid size of 0.1 (14400 points). The number of protein-like sequences 
(sequences which have unique ground states) is shown in figure 13. As one can see, 
there are jumps in the number of protein-like sequences. These jumps specify the 
borders of regions of relative stability within the space of energy parameters. A 
closer examination shows that these border lines contain sharp dips adjacent to the 
jumps. The large changes in the number of protein-like sequences show that when we 
cross these borders the ground states of many sequences change, and degenerate 
ground states are replaced by non-degenerate ones (or vice versa). However, nothing 
can be said about the details of these changes. One can get some idea about what is 
happening on these border lines by comparing the contour plot of figure 13a (figure 
13b) with the ordering lines diagram for one particular sequence (figure 6). As 
mentioned in section 3, the ordering lines specify level crossings, and type-3 
degeneracies only occur on the ordering line itself. These ordering lines constitute 
the underlying cause of the sharp dips observed in the borders. This is more evident 
in figure 14. In this figure we have shown those points in the energy parameter 
space where at least one type-3 degeneracy occurs. This diagram is in fact a 
superposition of diagrams like figure 6 for all sequences, and any line in it 
corresponds to many ordering lines between ground state candidate cells. 

We can find similar information for other types of degeneracies. For example, the 
number of sequences which have type-1 degenerate ground states is shown in figure 
15. As one can see in this diagram, the number of such sequences vanishes for large 
$E_c$ and small $\gamma$. For large $E_c$ the number of contacts $b$ plays an 
essential role in the selection of the ground state (equation (7)). Type-1 
degeneracies do not occur for highly compact structures (see figure 2). Thus this 
type of degeneracy is more relevant in the region $E_c < \gamma$. We have not shown 
the corresponding information for type-2 degeneracies, as it contains no new 
information; similar border jumps can be observed in the number of sequences with 
this type of degeneracy too. The maximum percentages of sequences with 
non-degenerate, type-1 degenerate, and type-2 degenerate ground states in the chosen 
region are 40.0%, 5.06% and 64.9%, respectively. 

In addition to providing information about the sequences, this procedure also finds 
the ground states. Since the energy parameters determine which states are the ground 
states, the number of structures which can be the native state of some sequence also 
depends on the energy parameters. Figure 16 shows the number of native states as a 
function of the energy parameters. The importance of compactness for large values of 
$E_c$ can also be seen in this diagram. Note that the smallest value for the number 
of native states is 503. This number corresponds to the number of most compact 
structures of length 20. Again, large jumps in the number of native states are 
observed. One can also find the average designability of the structures by dividing 
the data of figure 13 by those of figure 16 (the ratio of the number of sequences to 
the corresponding number of native structures). 



VI. CONCLUSION 

Due to the short-range nature of inter-monomer interactions, the configuration 
energy of protein sequences can be determined by using configuration contact 
matrices. In this paper, it has been shown that for this class of problems, where 
one is interested in native states of proteins, the space of physical contact maps 
can be reduced to a much smaller set by removing all irrelevant maps. We have found 
the reduced set of contact maps for sequences of lengths up to 20 by exact 
enumeration. This reduced set of contact maps shows a scale-independent behavior, as 
shown in figure 4. 

Using the reduced set of contact maps, the ground 
state candidates for all sequences were found in the HP 
model. The number of these ground state candidates 
is quite small. The ground state candidates divide the 
space of energy parameters into several cells. By finding 
this cell structure for all sequences, we have found the 
native states for all sequences of different lengths, for a 
wide range of energy parameters. Jumps are observed in 
the number of protein-like sequences. These jumps are 
related to boundaries of the aforementioned cells. 

Another interesting result is that we find some sequences with absolute native 
states, i.e. their native states are not sensitive to the values of the energy 
parameters. Our results show that the number of such perfectly stable sequences 
grows with length; however, their percentage decreases. 

Because the key tool used in this paper has been the structural information 
contained in the contact maps, the qualitative results can be generalized to all 
contact models, regardless of the details of the lattice and the contact rules. 

Figure Captions 



Figure 1. 

The number of self-avoiding walk structures, physical contact maps, members of the reduced set of contact maps and 
native structures, vs. length of sequences. 
Figure 2. 

Distribution of the number of structures vs. the number of contacts for sequences of length 18. 
Figure 3. 

The average compactness of structures for SAW, physical maps, reduced maps and native structures. 
Figure 4. 

The number of reduced maps scaled by the number of all structures at each compactness, for 
sequences with length 8 to 20. There is a transition near 0.8 and a cutoff near 0.5. The latter can 
be seen better on a logarithmic scale (inner graph). 
Figure 5. 

The percentage of non-degenerate maps for reduced and physical maps. 
Figure 6. 

The space of energy parameters for sequence HPPPHPHPHPPHPHPHPHHP is divided into six cells. 

The integer numbers $(m, a, b)$ inside any cell indicate the ground state corresponding to that cell. Three 
of these states are degenerate. The types of degeneracy for the degenerate states and the shapes of the structures 
for the non-degenerate ones are indicated in the cells. 
Figure 7. 

The histogram of the number of ground state candidates for 20-mers. The light and dark gray areas 
show the results for all and good sequences respectively. There are some "good sequences" with only one 
ground state candidate. 
Figure 8. 

The average of the number of ground state candidates for all and good sequences vs. length of sequences. 
Figure 9. 

Four examples of perfectly stable sequences and their absolute native structures. For any positive values 
of $\gamma$ and $E_c$ these sequences fold uniquely into the structures shown. 
Figure 10. 

The ratio of the numbers of perfectly stable sequences to all sequences decreases with length of sequences, 
but their absolute numbers increase (inner graph). 
Figure 11. 

The histogram of number of structures with a given designability. 
Figure 12. 

The average designability for structures with a given number of contacts, for L = 20. 
Figure 13. 

The number of protein-like sequences of length 20, for given values of energy parameters in a 12 x 12 
square region (arbitrary units); a) Three dimensional plot, b) Contour plot. 
Figure 14. 

The points in the energy parameter space (arbitrary units), where type-3 degeneracies occur, for sequences 
of length 20. The grid size is 0.1. 
Figure 15. 

The number of sequences of length 20, with type-1 degenerate ground states, for given values of energy 
parameters in a 12 x 12 square region (arbitrary units). 
Figure 16. 

The number of native states for sequences of length 20, for given values of energy parameters in a 12 x 12 
square region (arbitrary units). 